RadStat: An open-source statistical analysis tool for counts obtained by a GM counter

The interaction of ionizing radiation with matter is a stochastic process, and statistical analysis of such a process is a crucial step in understanding radioactivity. The Geiger–Müller (GM) counter is a radiation detector widely used in nuclear radiation surveying, which produces counts upon exposure to a radioactive source. A variety of multi-purpose software packages can be used to perform statistical analysis of measured counts from a GM counter. However, carrying out the analysis with such packages is a lengthy, error-prone and time-consuming process, which becomes more tedious as the number of measurements increases. In the present work, we have developed an open-source and easy-to-use graphical user interface (GUI) computer program named RadStat for statistical analysis of counts measured by a GM counter. RadStat has its own scripting syntax and is bundled with gnuplot for quick visualization of output results. We believe the present open-source GUI program would be a useful tool for research and teaching of nuclear radiation physics.


Introduction
Nuclear radiation is ubiquitous: we are exposed to radiation from various sources on a daily basis [1–5]. The interaction of radiation with matter is a stochastic process, so statistical analysis is a crucial step in understanding the underlying physics of such interactions [6,7]. For example, the Geiger–Müller (GM) counter, one of the most commonly used detectors for surveying nuclear radiation [8–11], provides information on the count or count rate upon exposure to a radiation source. The counts are meaningless on their own, and statistical analysis is required to give physical meaning to the readings. The basic information is provided by the mean count and its associated standard deviation, which estimate the most probable count and the spread of the data, respectively.
Once the counts from a radiation source are obtained, it is useful to check their distribution against the Poisson distribution by first putting the counts into fixed-size bins (i.e., performing binning), to ensure that the experimental measurements have been properly performed. In addition, it is common to approximate the Poisson distribution with the normal distribution using the mean count and the standard deviation [12,13]. The average count is obtained from the measured experimental counts, and the standard error is then calculated according to the sample size. Individual count readings can be checked against the 1σ ($\bar{x} \pm \sqrt{\bar{x}}$) and 2σ ($\bar{x} \pm 2\sqrt{\bar{x}}$) limits [14]. This statistical analysis is a lengthy, error-prone and time-consuming process, which becomes more tedious as the number of measurements increases. To the best of our knowledge, there is no dedicated open-source and easy-to-use software that is tailor-made for statistical analysis of GM counter data, and most researchers and students use tools such as Microsoft Excel, MATLAB, and other multi-purpose software to carry out statistical analysis of their experimental data. In the present work, we have developed an open-source graphical user interface (GUI) program named RadStat for statistical analysis of GM counter data. RadStat has its own scripting language to perform user-defined statistical analysis. RadStat has been bundled with gnuplot (http://www.gnuplot.info/) for ease of plotting and visualization of the results. A variety of tools and software packages can perform general statistical analysis, such as SPSS Statistics [15], SAS/STAT [16], Stata [17], Minitab [18] and many others. These packages are intended for general statistical analysis and are not dedicated to the statistics of nuclear radiation.
In addition, there are other tools that were developed for statistical analysis of nuclear radiation. For example, ROOT [19] is a powerful software framework written in the C++ programming language that provides statistical analysis, data processing and visualization; however, users need to learn and use C++ to communicate with the program. The ADAQ framework [20] uses C++ and Python libraries and has been designed to streamline the acquisition and analysis of radiation detector data produced in digital data acquisition (DAQ) systems and in Monte Carlo detector simulations; this software mainly focuses on data acquisition and lacks in-depth statistical analysis functions. InterSpec [21] is a native or web application that assists in analyzing spectral nuclear radiation data using a peak-based methodology; given a measured spectrum, it is useful for identifying radionuclides and for calculating source age and activity, shielding amounts and dose rates. In comparison, RadStat is easy to use, since no programming knowledge is required, and the statistical features provided by RadStat would be useful for early learners in this field to verify the statistics involved in nuclear radiation measurements.
Radioactivity and the interaction of radiation with matter are stochastic processes. The variations in the recorded counts from a radioactive source are the result of this stochasticity. It is important to understand the extent of this variation and the count that best represents the activity of the radioactive sample. Theoretically, the measured counts from a radioactive source follow the Poisson distribution, which under most conditions can be approximated by the normal distribution. The raw data (i.e., counts) are meaningless without statistical analysis. For example, the noise corresponding to statistical fluctuation in the counts obtained through nuclear radiation measurements could easily lead to false positives or false negatives if the data have not been rigorously checked using proper statistics. The statistical analysis tools offered by the RadStat program would be useful to further verify the statistics involved in nuclear radiation measurements. We believe the present program would be useful for statistical analysis of GM counter data and can be particularly beneficial for teaching and learning activities for young researchers and students in the field of nuclear radiation physics.
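The 1σ and 2σ check mentioned in this Introduction can be illustrated with a minimal Python sketch. It is not part of RadStat (which is written in FORTRAN90) and does not reproduce its output; the function name `sigma_limit_check` and the ten readings in `example_counts` are hypothetical and were chosen only to make the arithmetic easy to follow. The sketch assumes the Poisson estimate of the standard deviation, i.e., the square root of the mean count.

```python
import math


def sigma_limit_check(counts):
    """Return (mean, sigma, fraction within 1 sigma, fraction within 2 sigma).

    sigma is the Poisson estimate sqrt(mean), so the limits are
    mean +/- sqrt(mean) and mean +/- 2*sqrt(mean).
    """
    n = len(counts)
    mean = sum(counts) / n
    sigma = math.sqrt(mean)  # Poisson: variance equals the mean
    within_1 = sum(1 for c in counts if abs(c - mean) <= sigma) / n
    within_2 = sum(1 for c in counts if abs(c - mean) <= 2 * sigma) / n
    return mean, sigma, within_1, within_2


# Hypothetical readings for illustration only (not the measured data).
example_counts = [96, 104, 100, 91, 109, 100, 98, 102, 115, 85]

mean, sigma, frac_1, frac_2 = sigma_limit_check(example_counts)
print(f"mean = {mean:.1f}, sigma = {sigma:.1f}")
print(f"inside 1-sigma: {frac_1:.0%}, inside 2-sigma: {frac_2:.0%}")
```

For these ten hypothetical readings the mean is 100 and the estimated standard deviation is 10, so eight readings fall inside the 1σ limits and all ten fall inside the 2σ limits. For normally distributed data one would expect roughly 68% and 95%, respectively, which is the comparison such a check is meant to support.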

Material and methods
RadStat was written in the FORTRAN90 programming language. Microsoft PowerStation QuickWin run-time libraries and Microsoft Fortran libraries (using MSFLIB) were used in building the GUI [22,23]. FORTRAN90 was chosen mainly for its fast computational speed and numerical stability. In addition, using FORTRAN90 would make it easier to bundle RadStat with radiation transport codes such as Monte Carlo N-Particle (MCNP) and the Particle and Heavy Ion Transport code System (PHITS), which were written in FORTRAN. The bundled gnuplot uses Qt5 libraries for plotting the output data. RadStat is a portable computer program that does not require installation and has a size on disk of about 492 KB. In the present work, we have developed our own syntax for a variety of widely used statistical functions. This makes statistical analysis easier to perform, and can thus facilitate pedagogical applications. With these scripts, users do not need to write their own computer programs, which is helpful for those with little programming experience. The scripting commands are summarized in Table 1.
The commands programmed into RadStat should be written in a dedicated script data file for the program to read. The user can use any of the commands shown in Table 1 to perform a specific statistical analysis. These commands are read as strings and then recognized by the program after reading the script data file. In addition to the script, the GUI takes some inputs such as raw data number, raw data name, output file name, bin size and sample size, which are shown in Fig 1. These GUI inputs were read as 256-character strings and were then converted to 4-byte floating-point real numbers. RadStat has been bundled with gnuplot (http://www.gnuplot.info) to plot the binned counts from the experimental measurements and to provide theoretical counts using the Poisson and normal distributions. The estimation from the Poisson distribution was calculated as

$$p_{\mathrm{pois}}(x) = \frac{\lambda^{x} e^{-\lambda}}{x!} \qquad (1)$$

where $p_{\mathrm{pois}}(x)$ was the probability of getting the value $x$ from the Poisson distribution with the mean value $\lambda$. For large $x$ values, it would be tedious to numerically calculate the factorial of $x$ as needed in Eq (1). To circumvent this, we used Stirling's approximation [24,25] to numerically compute the factorial. The method provided a close approximation to the probability computed from the Poisson distribution. The Poisson distribution could also be approximated using the normal distribution as

$$p_{\mathrm{norm}}(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right) \qquad (2)$$

where $p_{\mathrm{norm}}(x)$ was the probability from the normal distribution of $x$, $\mu$ was the mean and $\sigma$ was the standard deviation. The use of gnuplot in combination with RadStat makes the comparison much easier through graphical representation. The program outputs the plotting data in gnuplot format after the plot button has been clicked.
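To illustrate why Stirling's approximation helps with the Poisson probability for large counts, the following is a minimal Python sketch. It is not RadStat's FORTRAN90 implementation. The source does not state which form of Stirling's approximation was used, so the logarithmic form $\ln x! \approx x\ln x - x + \tfrac{1}{2}\ln(2\pi x)$ is an assumption here, and the function names are hypothetical.

```python
import math


def log_factorial_stirling(x):
    """Approximate ln(x!) with Stirling's formula (assumed form):
    ln(x!) ~ x*ln(x) - x + 0.5*ln(2*pi*x)."""
    if x < 2:
        return 0.0  # 0! = 1! = 1, so ln(x!) = 0
    return x * math.log(x) - x + 0.5 * math.log(2.0 * math.pi * x)


def poisson_prob_stirling(x, lam):
    """Poisson probability lam**x * exp(-lam) / x!, evaluated in log space
    so that neither lam**x nor x! has to be formed explicitly."""
    return math.exp(x * math.log(lam) - lam - log_factorial_stirling(x))


def normal_prob(x, mu, sigma):
    """Normal probability density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


# Illustrative values only: a mean count of 100, so sigma = sqrt(100) = 10.
lam = 100.0
for x in (80, 90, 100, 110, 120):
    p_pois = poisson_prob_stirling(x, lam)
    p_norm = normal_prob(x, lam, math.sqrt(lam))
    print(f"x = {x:3d}  Poisson (Stirling) = {p_pois:.5f}  normal = {p_norm:.5f}")
```

Working in log space avoids overflow in the factorial and in $\lambda^{x}$. The leading relative error of this form of Stirling's approximation is about $1/(12x)$, i.e., below 0.1% for counts of order 100.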
The fluctuations arising from counting statistics represent an unavoidable source of uncertainty in all nuclear radiation measurements; these fluctuations can be quantified and compared with the predictions from statistical distributions such as the Poisson and normal distributions [26]. Previously, Tsoulfanidis and Landsberger [27] provided a detailed description of the statistical analysis process in nuclear radiation measurements, including all the required steps; these were implemented in the RadStat computer program. To the best of our knowledge, there is no other software tailor-made for statistical analysis of counts from nuclear radiation measurements that offers the same ease of use and functionality together with its own scripting syntax.

PLOS ONE
The RadStat program and its source code can be downloaded from https://figshare.com/articles/software/RadStat_An_open-source_statistical_analysis_tool_for_counts_obtained_by_a_GM_counter/17876057. The current version of our program is executable on Microsoft Windows systems. In addition, we have tested the program in a GNU/Linux environment using Wine and it worked well. RadStat does not require any installation and can run in portable mode. RadStat is distributed under the GNU General Public License version 3 (GNU GPLv3).
We used measurements from a $^{90}$Sr radioactive source to test the present computer program. A total of 100 measurements, each taken over a 10-second interval, were obtained using the ST360 GM counter system. The GM tube voltage was set at 840 V and the total counts (source + background radiation) were measured. If the objective were to determine the "net count" (i.e., "gross count" minus "background count") due to the source itself, the "background count" should be subtracted, in particular for relatively weak sources. However, the objective of the example here was to study the distribution of the gross count, which followed the Poisson distribution, so the "background count" was not subtracted.
The RadStat program cannot analyze the data in real time, mainly because the entire count array must be read by the program and an array of a specific size must be allocated in memory prior to runtime. The data from the ST360 counter were recorded and transferred to the computer manually. The source activity was 407 Bq and the source-to-detector distance was 2 cm.

Results and discussion
The 100 measured counts are shown in Table 2. These counts were used as input data to test our RadStat program. All commands shown in Table 1 were used to demonstrate the capability of RadStat. Three different cases were chosen for this numerical test, namely, (1) bin size of 5 and sample size of 10, (2) bin size of 10 and sample size of 20 and (3) bin size of 2 and sample size of 5.
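The roles of the bin size and the sample size in these cases can be sketched in Python as follows. This is an illustration only and not RadStat's implementation. It assumes that binning means tallying readings into consecutive intervals of width equal to the bin size, and that sampling means splitting the readings into consecutive groups of the sample size and reporting each group's mean and standard error. The function names and the ten readings are hypothetical and are not the data of Table 2.

```python
import math
from collections import Counter


def bin_counts(counts, bin_size):
    """Tally readings into bins [edge, edge + bin_size), keyed by lower edge."""
    tally = Counter((c // bin_size) * bin_size for c in counts)
    return dict(sorted(tally.items()))


def sample_statistics(counts, sample_size):
    """Split readings into consecutive full samples and return a list of
    (sample mean, standard error of the mean) pairs."""
    results = []
    for start in range(0, len(counts) - sample_size + 1, sample_size):
        sample = counts[start:start + sample_size]
        mean = sum(sample) / sample_size
        # Unbiased sample variance (divides by n - 1; needs sample_size >= 2).
        variance = sum((c - mean) ** 2 for c in sample) / (sample_size - 1)
        results.append((mean, math.sqrt(variance / sample_size)))
    return results


# Hypothetical readings for illustration only (not the measured data).
example_counts = [96, 104, 100, 91, 109, 100, 98, 102, 115, 85]

for edge, number in bin_counts(example_counts, 5).items():
    print(f"bin {edge}-{edge + 4}: {number}")

for mean, std_err in sample_statistics(example_counts, 5):
    print(f"sample mean = {mean:.1f}, standard error = {std_err:.2f}")
```

A smaller bin size gives a finer histogram with fewer readings per bin, while a larger sample size gives fewer samples, each with a smaller standard error; the three cases listed above probe this trade-off.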
The direct output of the statistical analysis from RadStat for the measured data is shown in Box 1. In this analysis, a bin size of 5 and a sample size of 10 were used. The output from RadStat provides the users with all required statistical analyses for the measured experimental data. The RadStat program also automatically generates Google search links, which help the users to get more information on each topic as necessary.
Box 1 (excerpt).
We were inspired to develop this project and computer program after teaching nuclear radiation labs for many years at City University of Hong Kong. The present tool and scripting language would be useful to familiarize the students and young researchers with the concept of statistics of nuclear radiation. The program aims at analysis of GM counter data. RadStat loads in the measured experimental data and performs the required statistical analysis.
11. -Some GM counter tips-
12. A Geiger counter (Geiger-Muller tube) is a device used for the detection and measurement of all types of ionizing radiation: alpha, beta and gamma radiation. Basically, it consists of a pair of electrodes surrounded by a gas. The electrodes have a high voltage across them. The gas used is usually Helium or Argon. When radiation enters the tube, it can ionize the gas. The ions (and electrons) are attracted to the electrodes and an electric current is produced. A scaler counts the current pulses, and one obtains a "count" whenever radiation ionizes the gas. The apparatus consists of two parts, the tube and the (counter + power supply). The Geiger-Mueller tube is usually cylindrical, with a wire down the center. The (counter + power supply) have voltage controls and timer options. A high voltage is established across the cylinder and the wire as shown on the page of figures. When ionizing radiation such as an alpha, beta or gamma particle enters the tube, it can ionize some of the gas molecules in the tube. From these ionized atoms, an electron is knocked out of the atom, and the remaining atom is positively charged. The high voltage in the tube produces an electric field inside the tube. The electrons that were knocked out of the atom are attracted to the positive electrode, and the positively charged ions are attracted to the negative electrode. This produces a pulse of current in the wires connecting the electrodes, and this pulse is counted. After the pulse is counted, the charged ions become neutralized, and the Geiger counter is ready to record another pulse. In order for the Geiger counter tube to restore itself quickly to its original state after radiation has entered, a gas is added to the tube [28]. In conclusion, the Geiger-Muller counter is cool;) copy the link into your browser for a quick google search:
13. https://www.google.com/search?q=geiger+muller+counter
14. -How data mean/average is calculated?-
15. mean: sum of a collection of numbers divided by the count of numbers in the collection [29]. Here, the collection refers to the inputted measured data into the program. Simply, sum all the data points and divide it by the total number of data points. copy the link into your browser for a quick google search:

The experimental results were binned with a bin size of 5 (lines 36 to 48 in Box 1). The bins, cumulative number, number in each bin, estimation from the Poisson distribution and approximation with the normal distribution are summarized in Table 3. The estimations from the Poisson and normal distributions for the number of counts in each bin were close to those measured experimentally. Since the total number of experimental measurements was 100, the cumulative counts added up to 100. The results in Table 3 were plotted using gnuplot (bundled with the RadStat program) and are shown in Fig 2. The differences between the Poisson and normal distribution estimations are also shown in the output of RadStat presented in Box 1 (lines 49 to 62). Based on these differences, the program automatically comments on the agreement between the two estimations (line 63). In addition, in Box 1 (lines 64 to 71), the 1σ ($\bar{x} \pm \sqrt{\bar{x}}$) and 2σ ($\bar{x} \pm 2\sqrt{\bar{x}}$) limits were used to check the percentage of data that fell inside or outside the 1σ and 2σ limits. Furthermore, the results of sampling and sample analysis (sample size set to 10

Conclusions
In the present work, an open-source GUI computer program named RadStat was developed.
RadStat is dedicated to performing full statistical analysis of experimental data obtained using a GM counter, but it can also be used to analyze data obtained using other radiation detectors. RadStat reads the input data in ASCII format and provides binning and sampling functions based on user-defined bin size and sample size. RadStat has its own scripting language, which is simple to use and understand. The program estimates the counts in each bin using the Poisson distribution and approximates these using the normal distribution. In addition, a check of the data against the 1σ ($\bar{x} \pm \sqrt{\bar{x}}$) and 2σ ($\bar{x} \pm 2\sqrt{\bar{x}}$) limits is performed, and the program automatically comments on the results of this check. Finally, the program performs sampling and sample analysis of the user-inputted data. RadStat is a portable computer program which does not require installation. RadStat was bundled with gnuplot to enable quick visualization of the output results. We believe the present open-source GUI program would be a useful tool for research and teaching of nuclear radiation physics and of radiation science in general. Recently, we have developed the MCHP (Monte Carlo + Human Phantom) platform to facilitate teaching nuclear radiation physics using the MCNP Monte Carlo package [32]. In future work, we aim to bundle RadStat into the MCHP platform; this would be particularly useful for statistical analysis of nuclear radiation interactions with the human body and organs. Such statistical analysis would help enhance the understanding of the significance of interaction events from different ionizing radiations with the human body and organs.